A Deep Learning Approach to Landmark Detection in Facial Images
In this paper an alternative approach to landmark detection using cascaded convolutional neural networks is proposed. The cascade consists of three levels, each with a number of convolutional neural networks. After each level of the cascade, the predictions converge. This results in accurate predictions of landmark locations in facial images. The main advantage over other methods proposed in the literature is the integration of face detection and landmark detection in one system. The method is also able to implicitly encode both local constraints and shape constraints over the entire image, giving it an advantage over regular non-deep-learning detection methods. As such, the cascaded neural network substantially outperforms STASM, a state-of-the-art shape model approach. However, the model does not hold up well on data that is not similar to the images it was trained on.
MorphPool: Efficient Non-linear Pooling & Unpooling in CNNs
Pooling is essentially an operation from the field of Mathematical
Morphology, with max pooling as a limited special case. The more general
setting of MorphPooling greatly extends the tool set for building neural
networks. In addition to pooling operations, encoder-decoder networks used for
pixel-level predictions also require unpooling. It is common to combine
unpooling with convolution or deconvolution for up-sampling. However, using its
morphological properties, unpooling can be generalised and improved. Extensive
experimentation on two tasks and three large-scale datasets shows that
morphological pooling and unpooling lead to improved predictive performance at
much reduced parameter counts.
Comment: Accepted paper at the British Machine Vision Conference (BMVC) 202
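The claim that max pooling is a special case of a morphological operation can be illustrated with a small sketch. The code below is not the paper's implementation; it is a minimal 1D example, assuming a grayscale dilation written in correlation form (no reflection of the structuring element) followed by subsampling. The function name `dilate_pool_1d` and its parameters are hypothetical. With a flat (all-zero) structuring element the result is ordinary max pooling; a non-flat element gives a more general pooling.

```python
import numpy as np


def dilate_pool_1d(signal, struct_elem, stride):
    """Grayscale dilation of a 1D signal followed by subsampling.

    out[i] = max_j (signal[i * stride + j] + struct_elem[j])

    Hypothetical helper for illustration only, not the paper's code.
    """
    signal = np.asarray(signal, dtype=float)
    struct_elem = np.asarray(struct_elem, dtype=float)
    k = len(struct_elem)
    n_out = (len(signal) - k) // stride + 1
    out = np.empty(n_out)
    for i in range(n_out):
        window = signal[i * stride : i * stride + k]
        # Add the structuring element, then take the maximum over the window.
        out[i] = np.max(window + struct_elem)
    return out


x = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0]

# Flat structuring element: identical to max pooling with window 2, stride 2.
flat = dilate_pool_1d(x, [0.0, 0.0], stride=2)

# Non-flat structuring element: the second position in each window is favoured.
shaped = dilate_pool_1d(x, [0.0, 1.5], stride=2)
```

In a network, the structuring element would presumably be a learnable parameter; how the paper parameterises it is not stated in the abstract.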
Multi-Loss Weighting with Coefficient of Variations
Many interesting tasks in machine learning and computer vision are learned by
optimising an objective function defined as a weighted linear combination of
multiple losses. The final performance is sensitive to choosing the correct
(relative) weights for these losses. Finding a good set of weights is often
done by adopting them into the set of hyper-parameters, which are set using an
extensive grid search. This is computationally expensive. In this paper, we
propose a weighting scheme based on the coefficient of variations and set the
weights based on properties observed while training the model. The proposed
method incorporates a measure of uncertainty to balance the losses, and as a
result the loss weights evolve during training without requiring another
(learning-based) optimisation. In contrast to many loss weighting methods in
the literature, we focus on single-task multi-loss problems, such as monocular
depth estimation and semantic segmentation, and show that multi-task approaches
for loss weighting do not work on those single-task problems. The validity of the
approach is shown empirically for depth estimation and semantic segmentation on
multiple datasets.
Comment: Paper was accepted at the IEEE Winter Conference on Applications of
Computer Vision 2021 (WACV2021)
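As a rough illustration of weighting losses by their coefficient of variation (standard deviation divided by mean), the sketch below tracks running statistics of each loss and normalises the resulting ratios into weights. It is a simplified sketch, not the paper's exact scheme: it assumes the statistics are taken over the raw, positive loss values with Welford's online update, and the class name `CoVWeights` and its methods are hypothetical.

```python
import numpy as np


class CoVWeights:
    """Running coefficient-of-variation weights for several losses.

    Hypothetical helper for illustration; assumes all losses are positive.
    """

    def __init__(self, num_losses):
        self.count = 0
        self.mean = np.zeros(num_losses)
        self.m2 = np.zeros(num_losses)  # running sum of squared deviations

    def update(self, losses):
        """Record the current loss values and return normalised weights."""
        losses = np.asarray(losses, dtype=float)
        n = len(losses)

        # Welford's online update of mean and sum of squared deviations.
        self.count += 1
        delta = losses - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (losses - self.mean)

        if self.count < 2:
            # Not enough observations yet: fall back to uniform weights.
            return np.full(n, 1.0 / n)

        std = np.sqrt(self.m2 / self.count)
        cov = std / np.maximum(self.mean, 1e-12)
        total = cov.sum()
        if total == 0.0:
            return np.full(n, 1.0 / n)
        return cov / total  # weights sum to 1


tracker = CoVWeights(num_losses=2)
w_first = tracker.update([1.0, 10.0])   # first step: uniform weights
w_second = tracker.update([0.5, 10.0])  # loss 0 varies, loss 1 is constant
total_loss = float(np.dot(w_second, [0.5, 10.0]))
```

Because the weights follow from observed statistics, no extra learned parameters or grid search are involved in this sketch; a loss that is still changing relative to its mean receives more weight than one that has plateaued.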
Benefits of Social Learning in Physical Robots
Robot-to-robot learning, a specific case of social learning in robotics, allows robot controllers to be transferred directly from one robot to another. Previous studies showed that the exchange of controller information can increase learning speed and performance. However, most of these studies were performed in simulation, where robots are identical. Therefore, the results do not necessarily transfer to a real environment, where each robot is unique by definition due to random differences in hardware. In this paper, we investigate the effect of exchanging controller information, on top of individual learning, in a group of Thymio II robots for two tasks: obstacle avoidance and foraging. The controllers of the robots are neural networks that evolve using a modified version of the state-of-the-art NEAT algorithm, called cNEAT, which allows the conversion of innovation numbers from other robots. This paper shows that robot-to-robot learning seems to at least parallelise the search, reducing wall-clock time. Additionally, controllers are less complex, resulting in a smaller search space.